Calibrating a network meta-analysis of diabetes trials of sodium glucose cotransporter 2 inhibitors, glucagon-like peptide-1 receptor analogues and dipeptidyl peptidase-4 inhibitors to a representative routine population: a systematic review protocol

Introduction
Participants in randomised controlled trials (trials) are generally younger and healthier than many individuals encountered in clinical practice. Consequently, the applicability of trial findings is often uncertain. To address this, results from trials can be calibrated to more representative data sources. In a network meta-analysis, using a novel approach which allows the inclusion of trials whether or not individual-level participant data (IPD) are available, we will calibrate trials for three drug classes (sodium glucose cotransporter 2 (SGLT2) inhibitors, glucagon-like peptide-1 (GLP1) receptor analogues and dipeptidyl peptidase-4 (DPP4) inhibitors) to the Scottish diabetes register.

Methods and analysis
Medline and EMBASE databases, the US clinical trials registry (clinicaltrials.gov) and the Chinese Clinical Trial Registry (chictr.org.cn) will be searched from 1 January 2002. Two independent reviewers will apply eligibility criteria to identify trials for inclusion. Included trials will be phase 3 or 4 trials of SGLT2 inhibitors, GLP1 receptor analogues or DPP4 inhibitors, with placebo or active comparators, in participants with type 2 diabetes, with at least one of glycaemic control, change in body weight or major adverse cardiovascular event as outcomes. Unregistered trials will be excluded. We have identified a target population from the population-based Scottish diabetes register. The chosen cohort comprises people in Scotland with type 2 diabetes who either (1) require further treatment due to poor glycaemic control where any of the three drug classes may be suitable, or (2) have adequate glycaemic control but are already on one of the three drug classes of interest or insulin.

Ethics and dissemination
Ethical approval for IPD use was obtained from the University of Glasgow MVLS College Ethics Committee (Project: 200160070). The Scottish diabetes register has approval from the Scottish A Research Ethics Committee (11/AL/0225) and operates with Public Benefit and Privacy Panel for Health and Social Care approval (1617-0147).

PROSPERO registration number CRD42020184174.


Aim:
1) To identify a clinically appropriate target population within the Scottish diabetes register for calibration modelling of a large network meta-analysis of glucose-lowering drugs
2) To document the variables to be collected and summarised within the identified population

Background
For the proposed calibration modelling to be clinically relevant, the routine-data target population to which the models are applied must be clearly defined and clinically justifiable. Using a 2019 extract of the SCI-diabetes database, we aim to identify a population of people with type 2 diabetes mellitus in whom prescription of any of the three drug classes of interest (sodium glucose co-transporter 2 inhibitors (SGLT2i), glucagon-like peptide 1 receptor agonists (GLP1ra) and dipeptidyl peptidase-4 inhibitors (DPP4i)) would realistically be considered should the individual require treatment escalation. We aim to exclude anyone with a significant contraindication to any of the three drug classes. Subsequent work will use clustering to identify more specific subsets of the population (e.g. based on age, sex, body weight, renal function or cardiovascular risk), allowing calibration to those subsets of the overall target population. This will be described in a later document.

Pilot work
We conducted exploratory searches of the 2017 extract of SCI-diabetes to help guide this protocol. We identified those within the register who were prescribed at least one of the drug classes of interest: 56,867 people overall (mean age 64.65 years, weight 98.14 kg, glycated haemoglobin (HbA1c) 66.97 mmol/mol, systolic blood pressure 136.55 mmHg, estimated glomerular filtration rate (eGFR) 58.45 ml/min/1.73m²).

Proposed steps
1) Access the 2019 data extract and become familiar with the datasets available within it and the data included in each.
2) Limit included participants to those where absolute contraindications for the proposed drug classes are absent (see specific exclusions).
3) [...] Table 1 where available.
4) Continuous observations for each individual will be taken as the mean of measurements over the 3 years prior to 1/1/19. For categorical variables (e.g. smoking), the most recent measurement in the last 3 years will be taken.
5) If all of the following variables are missing for the last 3 years, we will presume that the individual has either moved away or is not engaged with clinical services, and they will not be included: HbA1c, systolic blood pressure, diastolic blood pressure, smoking status, fasting plasma glucose, urinary albumin creatinine ratio, total cholesterol, high density lipoprotein cholesterol, low density lipoprotein cholesterol, eGFR and body mass index.
6) Previous comorbidities/prescriptions will be extracted as present if they appear in the previous 10 years of data.
7) Comorbidity data will be defined using ICD10 codes within the linked SMR01 dataset (specified below). As in large cardiovascular outcome trials (e.g. CANVAS¹), history of cardiovascular disease will be defined as history of atherosclerotic cardiovascular disease including coronary, cerebrovascular or peripheral vascular disease.
8) Comorbidity data from SMR01/prescribing data will be included where the comorbidity appears in any position in the discharge data, e.g. primary diagnosis or any other position of diagnosis.
9) Create a preliminary definition of the overall population to be used for calibration based on the above, which may include modification of the variables collected based on availability.
10) Provide summary statistics, including the numbers included/excluded, to the steering committee and (PRIOR to performing calibration or running the NMA model on trial data) amend the target population protocol on the basis of feedback.
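
Steps 4 and 5 above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's extraction code: the function names, the `(date, value)` record layout, the variable names in `ENGAGEMENT_VARS` and the exact window boundaries (on or after 1/1/16 and before 1/1/19) are all assumptions made for the example.

```python
from datetime import date
from statistics import mean

# Assumed window: the 3 years before the index date of 1/1/19.
INDEX_DATE = date(2019, 1, 1)
WINDOW_START = date(2016, 1, 1)

# Hypothetical variable names for the step 5 engagement check.
ENGAGEMENT_VARS = [
    "hba1c", "sbp", "dbp", "smoking", "fpg", "uacr",
    "total_chol", "hdl_chol", "ldl_chol", "egfr", "bmi",
]


def in_window(d):
    """True if a measurement date falls inside the assumed 3-year window."""
    return WINDOW_START <= d < INDEX_DATE


def summarise_continuous(measurements):
    """Mean of in-window values from (date, value) pairs, or None if none exist."""
    values = [v for d, v in measurements if in_window(d)]
    return mean(values) if values else None


def summarise_categorical(measurements):
    """Most recent in-window value from (date, value) pairs, or None if none exist."""
    recent = [(d, v) for d, v in measurements if in_window(d)]
    return max(recent, key=lambda r: r[0])[1] if recent else None


def is_engaged(summary):
    """Step 5: keep the individual unless every listed variable is missing."""
    return any(summary.get(name) is not None for name in ENGAGEMENT_VARS)


# Hypothetical records for one individual.
hba1c = [(date(2015, 6, 1), 80.0), (date(2017, 3, 1), 60.0), (date(2018, 9, 1), 70.0)]
smoking = [(date(2016, 5, 1), "current"), (date(2018, 2, 1), "ex")]
summary = {
    "hba1c": summarise_continuous(hba1c),
    "smoking": summarise_categorical(smoking),
}
print(summary, is_engaged(summary))
```

The 2015 HbA1c value falls outside the window and is ignored, so only the two later measurements contribute to the mean.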

Proposed population defining characteristics
-Must have a documented diagnosis of type 2 diabetes mellitus within the derived diagnosis variable in the dataset, diagnosed before or on 1/1/18
-There will be no limits to the duration of diabetes diagnosis
-Those with diabetes in remission will be excluded when the HbA1c limit is applied.

Glycaemic control:
-No limit to HbA1c at diagnosis
-Limit population to those with most recent HbA1c ≥53 mmol/mol, or those with HbA1c <53 mmol/mol who are currently on one of the three drug classes of interest or insulin.

Body Mass Index:
-Limit to those with a most recent BMI measurement ≥23.5 kg/m² (use the cleaned variable, either the clinician-entered variable from SCI-diabetes or derived from weight/height)
-More specific BMI groupings will likely be considered within the clustering subsets.
-Provide summary data to the steering committee regarding those who would be excluded should the BMI cutoff be changed to 20 or 25 kg/m²

Current drugs:
-It will be permissible for those within the target population to be on one or two of the three target drug classes, as excluding these people is likely to unfavourably skew the target population.
-It will also be permissible to be taking other glucose-lowering drugs including insulin, metformin, sulfonylureas, thiazolidinediones and alpha glucosidase inhibitors.
-There will be no limits on non-diabetes drugs including antihypertensives, ACEi/ARB, statins or antiplatelets
-There will be a limit on high-dose oral steroids: exclude if currently on prednisolone ≥5 mg or equivalent (BNF codes 1.5.2, 6.3.2, 10.1.2) as of 1/1/19

Renal function:
-Limit to those with eGFR >30 ml/min/1.73m² (derived CKD EPI variable from within Diabepi).
-Exclude if on current renal replacement therapy (linked Renal Registry Data within Diabepi)

Cardiovascular disease/risk:
-There will be no limit on prior cardiovascular disease, including heart failure, or on cardiovascular risk factors (e.g. smoking, dyslipidaemia) at this stage.
-These factors will be considered further in the subset clustering
-ICD10 codes for CV disease include coronary disease, cerebrovascular ischaemic disease, unspecified cerebral infarction, unspecified atherosclerosis, and peripheral vascular disease.

[...] 'ADALIMUMAB', 'GOLIMUMAB', 'VEDOLIZUMAB', 'TOFACITINIB', 'USTEKINUMAB') plus ≥1 outpatient appointment at Gastroenterology within the past 3 years based on SMR00 coding (Specialty = A9, attendance status = 1 (seen)). Whilst this will not identify those with milder disease in the community, and may include people with other diagnoses in error, in practical terms it will likely identify and exclude those with more severe disease in whom incretin therapies would be contraindicated.
-Gastroparesis: Whilst we intended to exclude those with a history of gastroparesis, there is no ICD10 code specific enough for this, so it was not possible with the available data.
-Recent diagnosis of cancer (based on a record in the SMR06 cancer register database in the last 3 years).
-End of Life: Exclude, based on SMR01 data, if there was admission from or discharge to a hospice at any time (location code = 62), admission under palliative care (spec = AM), admission reason of palliative care or geriatric palliative care (admreas = 1M/4B) or admission to a palliative care facility (sigfac = 1G), as treatment is unlikely to be appropriate
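
The numeric criteria above (HbA1c, BMI, eGFR, renal replacement therapy and high-dose steroids) can be expressed as a single predicate. The following is a sketch under stated assumptions, not the study's code: the `Person` fields and function names are hypothetical, individuals with a missing HbA1c, BMI or eGFR are treated as not eligible (the protocol does not specify this), and the remaining exclusions (for example recent cancer or end-of-life care) are omitted.

```python
from dataclasses import dataclass
from typing import Optional

HBA1C_THRESHOLD = 53.0  # mmol/mol; at or above this, further treatment is assumed to be needed
BMI_MIN = 23.5          # kg/m2, inclusive lower limit
EGFR_MIN = 30.0         # ml/min/1.73m2, exclusive lower limit


@dataclass
class Person:
    """Hypothetical per-person summary derived from the register extract."""
    hba1c: Optional[float]
    bmi: Optional[float]
    egfr: Optional[float]
    on_target_class_or_insulin: bool
    on_renal_replacement: bool = False
    high_dose_oral_steroid: bool = False


def meets_glycaemic_criterion(p: Person) -> bool:
    """HbA1c >= 53, or HbA1c < 53 while on a target drug class or insulin."""
    if p.hba1c is None:  # assumption: missing HbA1c is treated as not eligible
        return False
    return p.hba1c >= HBA1C_THRESHOLD or p.on_target_class_or_insulin


def in_target_population(p: Person) -> bool:
    """Apply the glycaemic, BMI, renal and steroid criteria together."""
    return (
        meets_glycaemic_criterion(p)
        and p.bmi is not None and p.bmi >= BMI_MIN
        and p.egfr is not None and p.egfr > EGFR_MIN
        and not p.on_renal_replacement
        and not p.high_dose_oral_steroid
    )


# Hypothetical individuals.
poorly_controlled = Person(hba1c=60.0, bmi=30.0, egfr=75.0, on_target_class_or_insulin=False)
controlled_untreated = Person(hba1c=48.0, bmi=30.0, egfr=75.0, on_target_class_or_insulin=False)
controlled_on_drug = Person(hba1c=48.0, bmi=30.0, egfr=75.0, on_target_class_or_insulin=True)
print(in_target_population(poorly_controlled),
      in_target_population(controlled_untreated),
      in_target_population(controlled_on_drug))
```

Keeping the thresholds as named constants makes it straightforward to rerun the counts for the alternative BMI cutoffs of 20 or 25 kg/m² requested for the steering committee.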

Variables of interest within target population
Aggregate descriptive characteristics from the target population will be gathered to facilitate trial outcome calibration in the next stage of this project.

Section 3: Statistical methods
Models will be fitted using the multilevel network meta-regression framework described by Phillippo et al,² which we outline here.

IPD studies provide outcomes $y_{ijk}$ and a vector of covariates $x_{ijk}$ for each individual $i$ in study $j$ receiving treatment $k$. The individual-level model for these data is:

$$y_{ijk} \sim \pi_{\mathrm{Ind}}(\theta_{ijk})$$

$$g(\theta_{ijk}) = \eta_{jk}(x_{ijk}) = \mu_j + x_{ijk}^{T}(\beta_1 + \beta_{2,k}) + \gamma_k$$

where $\pi_{\mathrm{Ind}}(\cdot)$ is a suitable likelihood distribution and $g(\cdot)$ a suitable link function, which transforms the expected outcome $\theta_{ijk}$ for an individual conditional on their covariates onto the linear predictor $\eta_{jk}(x_{ijk})$. $\mu_j$ are study-specific intercepts, $\beta_1$ and $\beta_{2,k}$ correspond to the effects of covariates and covariate-treatment interactions respectively, and $\gamma_k$ is the individual-level treatment effect of treatment $k$ compared to a chosen network reference treatment 1.

Aggregate studies provide aggregate outcomes $y_{\bullet jk}$ on treatment $k$ in study $j$, and a joint distribution for the covariates $f_{jk}(x)$. The aggregate-level model for these data is constructed by integrating the individual-level model over the population in each study:

$$y_{\bullet jk} \sim \pi_{\mathrm{Agg}}(\theta_{\bullet jk})$$

$$\theta_{\bullet jk} = \int_{\mathfrak{X}} g^{-1}\big(\eta_{jk}(x)\big)\, f_{jk}(x)\, dx$$

where $\pi_{\mathrm{Agg}}(\cdot)$ is a suitable likelihood distribution, $\theta_{\bullet jk}$ is the expected outcome on treatment $k$ in study $j$, and $\mathfrak{X}$ is the support of the covariates. The integral is evaluated using efficient quasi-Monte Carlo numerical integration, with a sample of $N$ points $\tilde{x}_{i;jk}$ from the joint distribution $f_{jk}(x)$:

$$\theta_{\bullet jk} \approx N^{-1} \sum_{i=1}^{N} g^{-1}\big(\eta_{jk}(\tilde{x}_{i;jk})\big).$$

The joint distribution of covariates $f_{jk}(x)$ is rarely available directly from study publications; instead, marginal summaries are available (e.g. means and standard deviations, proportions). However, under assumptions about the forms of the marginal distributions and the correlation structure (for example based on those observed in the IPD studies), the full joint distribution can be reconstructed. In practice, results are seen to be robust to misspecification of these assumptions.³

In a Bayesian framework, prior distributions will be placed on each of the model parameters $\mu_j$, $\beta_1$, $\beta_{2,k}$, $\gamma_k$. Random effects models and unrelated mean effects or node-splitting models will also be fitted within the above framework, to explore heterogeneity and inconsistency respectively.²

After model fitting, population-average treatment effects $d_{ab(P)}$ between any two treatments $a$ and $b$, in a population $P$ with mean covariate values $\bar{x}_{(P)}$, can be obtained as

$$d_{ab(P)} = \gamma_b - \gamma_a + \bar{x}_{(P)}^{T}(\beta_{2,b} - \beta_{2,a}).$$

[...] where for each subgroup the integration points from the full joint distribution are partitioned into each subgroup. This approach is only appropriate for independent subgroups (e.g. levels of a single covariate, or subgroups of multiple covariates reported factorially). For non-independent subgroups (e.g. multiple single-covariate subgroup analyses), this approach will be extended to account for the resulting correlations in the likelihood.
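
To make the quasi-Monte Carlo step and the population-average treatment effect concrete, the sketch below works through both for a deliberately simplified case. It is an illustration under stated assumptions and not the fitting code: it assumes a logit link, a single covariate that is normally distributed within the aggregate study, and a base-2 van der Corput sequence as the quasi-random point set. All function names and parameter values are hypothetical, and in a real analysis the parameters would be posterior draws rather than fixed numbers.

```python
import math
from statistics import NormalDist


def van_der_corput(i, base=2):
    """i-th element (i >= 1) of the van der Corput low-discrepancy sequence in (0, 1)."""
    q, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        q += digit / denom
    return q


def expit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def linear_predictor(x, mu, beta1, beta2_k, gamma_k):
    """Single-covariate linear predictor: mu + x * (beta1 + beta2_k) + gamma_k."""
    return mu + x * (beta1 + beta2_k) + gamma_k


def aggregate_expected_outcome(mu, beta1, beta2_k, gamma_k, x_mean, x_sd, n_points=256):
    """Average the inverse link of the linear predictor over quasi-random covariate points."""
    dist = NormalDist(x_mean, x_sd)
    total = 0.0
    for i in range(1, n_points + 1):
        # Map a uniform quasi-random point onto the assumed covariate distribution.
        x_tilde = dist.inv_cdf(van_der_corput(i))
        total += expit(linear_predictor(x_tilde, mu, beta1, beta2_k, gamma_k))
    return total / n_points


def population_average_effect(gamma_a, gamma_b, beta2_a, beta2_b, x_bar):
    """Treatment effect of b versus a at target-population mean covariate x_bar."""
    return gamma_b - gamma_a + x_bar * (beta2_b - beta2_a)


# Hypothetical parameter values on the logit scale.
theta = aggregate_expected_outcome(mu=-1.0, beta1=0.3, beta2_k=0.1, gamma_k=-0.5,
                                   x_mean=0.0, x_sd=1.0)
effect = population_average_effect(gamma_a=0.0, gamma_b=-0.5,
                                   beta2_a=0.0, beta2_b=0.1, x_bar=2.0)
print(theta, effect)
```

With several covariates the same idea applies, but the points would come from a multivariate low-discrepancy sequence pushed through the reconstructed joint distribution rather than a single normal quantile function.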